Automated Generation of Cross-Domain Analogies via Evolutionary Computation
Analogy plays an important role in creativity, and is extensively used in
science as well as art. In this paper we introduce a technique for the
automated generation of cross-domain analogies based on a novel evolutionary
algorithm (EA). Unlike existing work in computational analogy-making, which is
restricted to creating analogies between two given cases, our approach, given a
single case, creates both an analogy and the novel analogous case itself.
Our algorithm is based on the concept of "memes", which are units of culture,
or knowledge, undergoing variation and selection under a fitness measure, and
represents evolving pieces of knowledge as semantic networks. Using a fitness
function based on Gentner's structure mapping theory of analogies, we
demonstrate the feasibility of spontaneously generating semantic networks that
are analogous to a given base network.
Comment: Conference submission, International Conference on Computational Creativity 2012 (8 pages, 6 figures).
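As a rough illustration of the evolutionary loop described above (not the paper's implementation), the sketch below evolves candidate semantic networks under a user-supplied, structure-mapping-inspired fitness; the names init_population, fitness, and mutate are hypothetical placeholders.

```python
# Hypothetical sketch of the evolutionary loop: candidate semantic networks
# ("memes") are mutated and selected under a structure-mapping-inspired
# fitness. All operators here are illustrative placeholders.
import random

def evolve_analogy(base_network, init_population, fitness, mutate,
                   generations=200, pop_size=50):
    """fitness(candidate, base_network) should reward shared relational
    structure (a la Gentner's structure mapping), not surface attributes."""
    population = list(init_population)
    for _ in range(generations):
        scored = sorted(population, key=lambda net: fitness(net, base_network),
                        reverse=True)
        parents = scored[: pop_size // 2]               # truncation selection
        children = [mutate(random.choice(parents))
                    for _ in range(pop_size - len(parents))]
        population = parents + children
    return max(population, key=lambda net: fitness(net, base_network))
```

Truncation selection is used here purely for brevity; the paper's actual variation and selection operators over semantic networks are more involved.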
FNet: Mixing Tokens with Fourier Transforms
We show that Transformer encoder architectures can be massively sped up, with
limited accuracy costs, by replacing the self-attention sublayers with simple
linear transformations that "mix" input tokens. These linear transformations,
along with standard nonlinearities in feed-forward layers, prove competent at
modeling semantic relationships in several text classification tasks. Most
surprisingly, we find that replacing the self-attention sublayer in a
Transformer encoder with a standard, unparameterized Fourier Transform achieves
92-97% of the accuracy of BERT counterparts on the GLUE benchmark, but trains
nearly seven times faster on GPUs and twice as fast on TPUs. The resulting
model, FNet, also scales very efficiently to long inputs. Specifically, when
compared to the "efficient" Transformers on the Long Range Arena benchmark,
FNet matches the accuracy of the most accurate models, but is faster than the
fastest models across all sequence lengths on GPUs (and across relatively
shorter lengths on TPUs). Finally, FNet has a light memory footprint and is
particularly efficient at smaller model sizes: for a fixed speed and accuracy
budget, small FNet models outperform Transformer counterparts.
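A minimal sketch of the parameter-free mixing idea (not the released FNet code): the self-attention sublayer is replaced by a 2D discrete Fourier transform over the sequence and hidden dimensions, keeping only the real part, followed by the usual residuals, layer norms, and feed-forward sublayer.

```python
# Minimal sketch of an FNet-style encoder block: self-attention is replaced
# by a parameter-free 2D Fourier transform whose real part mixes the tokens.
import numpy as np

def layer_norm(x, eps=1e-6):
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def fnet_encoder_block(x, w1, b1, w2, b2):
    """x: (seq_len, d_model) token embeddings; w1/b1/w2/b2: feed-forward weights."""
    # Token mixing: FFT along the hidden dimension, then along the sequence
    # dimension; discard the imaginary part. No learned parameters here.
    mixed = np.fft.fft(np.fft.fft(x, axis=-1), axis=0).real
    x = layer_norm(x + mixed)                    # residual + layer norm
    hidden = np.maximum(0.0, x @ w1 + b1)        # standard ReLU feed-forward
    return layer_norm(x + hidden @ w2 + b2)      # residual + layer norm
```

Because the mixing step has no learned parameters, all of the block's capacity sits in the feed-forward weights, which is what drives the speed and memory savings described above.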
LongT5: Efficient Text-To-Text Transformer for Long Sequences
Recent work has shown that either (1) increasing the input length or (2)
increasing model size can improve the performance of Transformer-based neural
models. In this paper, we present a new model, called LongT5, with which we
explore the effects of scaling both the input length and model size at the same
time. Specifically, we integrated attention ideas from long-input transformers
(ETC), and adopted pre-training strategies from summarization pre-training
(PEGASUS) into the scalable T5 architecture. The result is a new attention
mechanism we call {\em Transient Global} (TGlobal), which mimics ETC's
local/global attention mechanism, but without requiring additional side-inputs.
We are able to achieve state-of-the-art results on several summarization tasks
and outperform the original T5 models on question answering tasks.
Comment: Accepted in NAACL 2022.
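A toy, single-head sketch of the Transient Global idea, simplified from the description above (the released LongT5 implementation differs in detail): each query attends to a local sliding window plus "global" tokens that are aggregates of fixed-size blocks of the input itself, so no extra side-inputs are required.

```python
# Toy sketch of Transient Global (TGlobal) attention: local windowed attention
# plus attention to block-wise aggregates of the sequence itself.
import numpy as np

def tglobal_attention(x, window=4, block=4):
    """x: (seq_len, d). Single-head, unprojected toy version."""
    seq_len, d = x.shape
    # Transient global tokens: aggregate of each fixed-size block of the input.
    n_blocks = seq_len // block
    globals_ = x[: n_blocks * block].reshape(n_blocks, block, d).mean(axis=1)
    keys = np.concatenate([x, globals_], axis=0)         # local + global keys
    scores = x @ keys.T / np.sqrt(d)
    # Mask local scores outside the sliding window; global keys stay visible.
    idx = np.arange(seq_len)
    local_mask = np.abs(idx[:, None] - idx[None, :]) > window
    scores[:, :seq_len][local_mask] = -1e9
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ keys
```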
Functional Interpolation for Relative Positions Improves Long Context Transformers
Preventing the performance decay of Transformers on inputs longer than those
used for training has been an important challenge in extending the context
length of these models. Though the Transformer architecture has fundamentally
no limits on the input sequence lengths it can process, the choice of position
encoding used during training can limit the performance of these models on
longer inputs. We propose a novel functional relative position encoding with
progressive interpolation, FIRE, to improve Transformer generalization to
longer contexts. We theoretically prove that this can represent some of the
popular relative position encodings, such as T5's RPE, Alibi, and Kerple. We
next empirically show that FIRE models have better generalization to longer
contexts on both zero-shot language modeling and long text benchmarks.
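An illustrative sketch of a FIRE-style bias, paraphrasing the idea rather than reproducing the paper's exact parameterization: the relative distance between query i and key j is squashed by a log transform and normalized by the query position (the progressive interpolation), and a small learned MLP maps the result to an additive attention bias. The callable mlp and the constants c and L below are illustrative assumptions.

```python
# Illustrative sketch of a FIRE-style relative position bias: normalized,
# log-transformed distances are mapped to biases by a tiny learned function.
import numpy as np

def fire_bias(seq_len, mlp, c=1.0, L=2.0):
    """mlp: callable mapping a scalar in [0, 1] to a bias value (assumed)."""
    psi = lambda x: np.log(c * x + 1.0)        # monotone squashing transform
    bias = np.zeros((seq_len, seq_len))
    for i in range(seq_len):                   # query position
        for j in range(i + 1):                 # causal: keys j <= i
            # Progressive interpolation: distance normalized by query position.
            bias[i, j] = mlp(psi(i - j) / psi(max(i, L)))
    return bias
```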
Generating Maps Using Markov Chains
In this paper we outline a method of procedurally generating maps using Markov chains. Our method attempts to learn what makes a "good" map from a set of given human-authored maps, and then uses those learned patterns to generate new maps. We present an empirical evaluation using the game "Super Mario Bros.," showing encouraging results.
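A toy sketch of the approach, with representation details (tile- vs. column-level states, chain order) simplified: a Markov chain over vertical level columns is estimated from example maps, then new levels are sampled column by column from left to right.

```python
# Toy sketch: learn column-to-column transition counts from example levels,
# then sample a new level by walking the resulting Markov chain.
import random
from collections import defaultdict

def train_chain(example_maps):
    """example_maps: list of levels, each a list of column strings."""
    transitions = defaultdict(list)
    for level in example_maps:
        for prev, nxt in zip(level, level[1:]):
            transitions[prev].append(nxt)      # record observed successors
    return transitions

def generate_map(transitions, start, length=50):
    level, current = [start], start
    for _ in range(length - 1):
        # Fall back to the start column if a state was never seen in training.
        current = random.choice(transitions.get(current) or [start])
        level.append(current)
    return level
```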
Story Representation In Analogy-Based Story Generation In Riu
Computational analogy offers a promising direction to algorithmically generating stories, a key challenge in computational narrative. Since analogy methods are very sensitive to the story representation being used, this paper focuses on story representation for analogy-based story generation. Specifically, we analyze existing story representation formalisms and propose a new approach based on the cognitive semantics theory of force dynamics. Finally, we present the results of our analogy-based interactive narrative system, Riu, to illustrate the utility of our proposal.